Speech Synthesis Data Collection for Visually Impaired Person

Authors

  • Masayuki Ashikawa
  • Takahiro Kawamura
  • Akihiko Ohsuga
Abstract

Crowdsourcing platforms provide attractive solutions for collecting speech synthesis data for visually impaired people. However, quality control problems remain because of low-quality volunteer workers. In this paper, we propose the design of a crowdsourcing system that allows us to devise quality control methods. We introduce four worker selection methods: preprocessing filtering, real-time filtering, post-processing filtering, and guess-processing filtering. These methods include a novel approach that utilizes a collaborative filtering technique in addition to a basic approach involving initial training or the use of gold-standard data. These quality control methods improved the quality of the collected speech synthesis data. Moreover, we have already collected 140,000 Japanese words from 500 million web data for use as speech synthesis data.


Similar resources

An Analysis of Human-to-Human Dialogs and Its Application to Assist Visually-Impaired People

A prototype lunch delivery web system for visually impaired was developed based on the analysis of human (a visually impaired who wants to order a lunch) to human (a sighted person who helps the visually impaired to decide lunch by reading aloud lunch menu) dialog. Based on the analysis, a prototype system was developed, which consists of three steps: 1) rough selection, 2) selection of favorit...


Assessing the adequate treatment of fast speech in unit selection speech synthesis systems for the visually impaired

This paper describes work in progress concerning the adequate modeling of fast speech in unit selection speech synthesis systems – mostly having in mind blind and visually impaired users. Initially, a survey of the main phonetic characteristics of fast speech will be given. From this, certain conclusions concerning an adequate modeling of fast speech in unit selection synthesis will be drawn. S...


Multilingual Text-to-Speech Software Component for Dynamic Language Identification and Voice Switching

Text-to-speech synthesis is a critical feature of the applications developed for people with visual or reading disabilities. In the last years there has been an increasing interest in multilingual text-to-speech synthesis, which requires multilingual text analysis and language specific speech synthesis. In this case, the dynamic switching of the synthetic voice is needed in order to enhance the...


Content validity of a home-based person-environment interaction assessment tool for visually impaired adults.

Home-based assessments require in-depth analyses of daily living difficulties. No assessment tool that has been validated with visually impaired adult subjects has allowed such analysis. This research adapted a home-based person-environment interaction assessment tool designed for persons who are visually impaired. The Model of Competence, an explanatory model of the person-environment relation...


Speech technology for the visually impaired - the Swedish perspective

Fundamental speech research at the Department of Speech Communication & Music Acoustics, KTH, has led to a multi-lingual text-to-speech system and a speech recognition device. Both are presently put to use by the visually impaired. To this date over five hundred text-to-speech systems are delivered, most of them to applications for the visually impaired. Some of these applications will be desc...




Publication date: 2014